Final Project Submission 2

Author

Kaushika Potluri

Published

November 11, 2022

Introduction

The research question I am interested in is the impact of women's education, including education about sex and fertility, on the fertility rate. Education raises the value of the time a woman spends working in the market and, as a result, raises the opportunity cost of the time spent caring for children. Across time and places there is a clear negative association between women's education and fertility, although its interpretation is ambiguous. A woman's level of education may affect fertility through its effects on children's health, the number of children desired, her ability to give birth, and her understanding of birth control options. Each of these channels is influenced by local, institutional, and national circumstances, and their relative importance may change as a society develops economically. We analyse the education–fertility relationship using data on women from Botswana. Quantifying this relationship realistically can be problematic for various reasons. First, factors such as motivation and ability are associated with both fertility and education but cannot be observed and, as a consequence, cannot be included in the model.

Research Question

Does education affect the fertility rate of women? What other factors influence the fertility rate of women?

Hypothesis

Note

H0 : β1 = β2 = … = βp-1 = 0

H1 : βj ≠ 0, for at least one value of j

OR

H0 : Variances equal, model is not significant.

H1 : Variances not equal, model is significant.
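The overall test stated above can be sketched in R as a comparison between an intercept-only model and the full model. This is a minimal illustration on simulated data, not the Botswana sample; the variable names `x1`, `x2`, and `y` and the coefficient values are assumptions made up for the example.

```r
# Simulated data (hypothetical; not the Botswana sample)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 - 0.5 * x1 + rnorm(n)   # x2 has no true effect on y

null_model <- lm(y ~ 1)         # model implied by H0: every slope is zero
full_model <- lm(y ~ x1 + x2)   # model with all predictors

# Overall F-test: a small p-value is evidence against H0
f_test  <- anova(null_model, full_model)
p_value <- f_test[["Pr(>F)"]][2]
print(f_test)
```

The same F statistic is reported in the last line of `summary(full_model)`, so in practice the explicit comparison is only needed when the restricted model is something other than the intercept-only model.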

Some measure of access to birth control could be useful if it varied by region. Policy changes in the advertising or availability of contraceptives can often be identified, but our data set contains no region information.

Loading in packages:

Code
library(readr)
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ dplyr   1.0.10
✔ tibble  3.1.8      ✔ stringr 1.5.0 
✔ tidyr   1.2.1      ✔ forcats 0.5.2 
✔ purrr   1.0.0      
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Code
library(ggplot2)
library(dplyr)
library(readxl)
library(DataExplorer)
Error in library(DataExplorer): there is no package called 'DataExplorer'
Code
library(summarytools)
Warning: no DISPLAY variable so Tk is not available
system might not have X11 capabilities; in case of errors when using dfSummary(), set st_options(use.x11 = FALSE)

Attaching package: 'summarytools'

The following object is masked from 'package:tibble':

    view
Code
library(lmtest)
Loading required package: zoo

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric
Code
library(car)
Loading required package: carData

Attaching package: 'car'

The following object is masked from 'package:dplyr':

    recode

The following object is masked from 'package:purrr':

    some
Code
library(reshape)

Attaching package: 'reshape'

The following object is masked from 'package:dplyr':

    rename

The following objects are masked from 'package:tidyr':

    expand, smiths
Code
library(sandwich)
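One of the `library()` calls above stopped with "there is no package called 'DataExplorer'". A possible way to catch this before rendering is to check package availability with `requireNamespace()`, which returns `FALSE` instead of raising an error. The sketch below is only an illustration; the vector `pkgs` is a hypothetical short list, with `stats` included simply because it ships with R.

```r
# Packages to check; "DataExplorer" is the one that failed to load above
pkgs <- c("stats", "DataExplorer")

# requireNamespace() returns TRUE/FALSE rather than stopping with an error
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

missing_pkgs <- names(available)[!available]
if (length(missing_pkgs) > 0) {
  message("Install before rendering: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```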

Reading in Data:

The data were taken from the data set that Professor Sander used in his article.

Code
Womendata <-  read.csv("_data/data.csv")
| Variable | Description |
|---|---|
| children | number of living children |
| educ | years of education |
| electric | equals 1 if has electricity |
| tv | equals 1 if has tv |
| urban | equals 1 if lives in urban area |
| evermarr | equals 1 if ever married |
| radio | equals 1 if has radio |
| bicycle | equals 1 if has bicycle |
| knowmeth | knows how to use birth control |
| usemeth | has ever used birth control |
| age | age in years |
| frsthalf | born in first half of year |

Descriptive Statistics

Code
summary(Womendata)
       X           mnthborn         yearborn          age       
 Min.   :   1   Min.   : 1.000   Min.   :38.00   Min.   :15.00  
 1st Qu.:1091   1st Qu.: 3.000   1st Qu.:55.00   1st Qu.:20.00  
 Median :2181   Median : 6.000   Median :62.00   Median :26.00  
 Mean   :2181   Mean   : 6.331   Mean   :60.43   Mean   :27.41  
 3rd Qu.:3271   3rd Qu.: 9.000   3rd Qu.:68.00   3rd Qu.:33.00  
 Max.   :4361   Max.   :12.000   Max.   :73.00   Max.   :49.00  
                                                                
    electric          radio              tv             bicycle      
 Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000  
 Median :0.0000   Median :1.0000   Median :0.00000   Median :0.0000  
 Mean   :0.1402   Mean   :0.7018   Mean   :0.09291   Mean   :0.2758  
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000  
 NA's   :3        NA's   :2        NA's   :2         NA's   :3       
      educ             ceb            agefbrth        children     
 Min.   : 0.000   Min.   : 0.000   Min.   :10.00   Min.   : 0.000  
 1st Qu.: 3.000   1st Qu.: 1.000   1st Qu.:17.00   1st Qu.: 0.000  
 Median : 7.000   Median : 2.000   Median :19.00   Median : 2.000  
 Mean   : 5.856   Mean   : 2.442   Mean   :19.01   Mean   : 2.268  
 3rd Qu.: 8.000   3rd Qu.: 4.000   3rd Qu.:20.00   3rd Qu.: 4.000  
 Max.   :20.000   Max.   :13.000   Max.   :38.00   Max.   :13.000  
                                   NA's   :1088                    
    knowmeth         usemeth          monthfm          yearfm     
 Min.   :0.0000   Min.   :0.0000   Min.   : 1.00   Min.   :50.00  
 1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.: 3.00   1st Qu.:72.00  
 Median :1.0000   Median :1.0000   Median : 6.00   Median :78.00  
 Mean   :0.9633   Mean   :0.5776   Mean   : 6.27   Mean   :76.91  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.: 9.00   3rd Qu.:83.00  
 Max.   :1.0000   Max.   :1.0000   Max.   :12.00   Max.   :88.00  
 NA's   :7        NA's   :71       NA's   :2282    NA's   :2282   
     agefm          idlnchld          heduc            agesq       
 Min.   :10.00   Min.   : 0.000   Min.   : 0.000   Min.   : 225.0  
 1st Qu.:17.00   1st Qu.: 3.000   1st Qu.: 0.000   1st Qu.: 400.0  
 Median :20.00   Median : 4.000   Median : 6.000   Median : 676.0  
 Mean   :20.69   Mean   : 4.616   Mean   : 5.145   Mean   : 826.5  
 3rd Qu.:23.00   3rd Qu.: 6.000   3rd Qu.: 8.000   3rd Qu.:1089.0  
 Max.   :46.00   Max.   :20.000   Max.   :20.000   Max.   :2401.0  
 NA's   :2282    NA's   :120      NA's   :2405                     
     urban           urb_educ          spirit          protest      
 Min.   :0.0000   Min.   : 0.000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.: 0.000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :1.0000   Median : 0.000   Median :0.0000   Median :0.0000  
 Mean   :0.5166   Mean   : 3.469   Mean   :0.4222   Mean   :0.2277  
 3rd Qu.:1.0000   3rd Qu.: 7.000   3rd Qu.:1.0000   3rd Qu.:0.0000  
 Max.   :1.0000   Max.   :20.000   Max.   :1.0000   Max.   :1.0000  
                                                                    
    catholic         frsthalf          educ0           evermarr     
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.0000   Median :1.0000   Median :0.0000   Median :0.0000  
 Mean   :0.1025   Mean   :0.5405   Mean   :0.2078   Mean   :0.4767  
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
                                                                    
Code
str(Womendata)
'data.frame':   4361 obs. of  28 variables:
 $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ mnthborn: int  5 1 7 11 5 8 7 9 12 9 ...
 $ yearborn: int  64 56 58 45 45 52 51 70 53 39 ...
 $ age     : int  24 32 30 42 43 36 37 18 34 49 ...
 $ electric: int  1 1 1 1 1 1 1 1 0 1 ...
 $ radio   : int  1 1 0 0 1 0 1 1 1 1 ...
 $ tv      : int  1 1 0 1 1 0 1 1 0 0 ...
 $ bicycle : int  1 1 0 0 1 0 1 1 0 0 ...
 $ educ    : int  12 13 5 4 11 7 16 10 5 4 ...
 $ ceb     : int  0 3 1 3 2 1 4 0 1 0 ...
 $ agefbrth: int  NA 25 27 17 24 26 20 NA 19 NA ...
 $ children: int  0 3 1 2 2 1 4 0 1 0 ...
 $ knowmeth: int  1 1 1 1 1 1 1 1 1 1 ...
 $ usemeth : int  0 1 0 0 1 1 1 1 1 0 ...
 $ monthfm : int  NA 11 6 1 3 11 5 NA 7 11 ...
 $ yearfm  : int  NA 80 83 61 66 76 78 NA 72 61 ...
 $ agefm   : int  NA 24 24 15 20 24 26 NA 18 22 ...
 $ idlnchld: int  2 3 5 3 2 4 4 4 4 4 ...
 $ heduc   : int  NA 12 7 11 14 9 17 NA 3 1 ...
 $ agesq   : int  576 1024 900 1764 1849 1296 1369 324 1156 2401 ...
 $ urban   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ urb_educ: int  12 13 5 4 11 7 16 10 5 4 ...
 $ spirit  : int  0 0 1 0 0 0 0 0 0 1 ...
 $ protest : int  0 0 0 0 1 0 0 0 1 0 ...
 $ catholic: int  0 0 0 0 0 0 1 1 0 0 ...
 $ frsthalf: int  1 1 0 0 1 0 0 0 0 0 ...
 $ educ0   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ evermarr: int  0 1 1 1 1 1 1 0 1 1 ...
Code
print(dfSummary(Womendata, varnumbers = FALSE, plain.ascii = FALSE, graph.magnif = 0.30, style = "grid", valid.col = FALSE), 
      method = 'render', table.classes = 'table-condensed')
Warning in png(png_loc <- tempfile(fileext = ".png"), width = 150 *
graph.magnif, : unable to open connection to X11 display ''


Data Frame Summary

Womendata

Dimensions: 4361 x 28
Duplicates: 0
| Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|
| X [integer] | Mean (sd): 2181 (1259.1); min ≤ med ≤ max: 1 ≤ 2181 ≤ 4361; IQR (CV): 2180 (0.6) | 4361 distinct values (Integer sequence) | 0 (0.0%) |
| mnthborn [integer] | Mean (sd): 6.3 (3.3); min ≤ med ≤ max: 1 ≤ 6 ≤ 12; IQR (CV): 6 (0.5) | 12 distinct values | 0 (0.0%) |
| yearborn [integer] | Mean (sd): 60.4 (8.7); min ≤ med ≤ max: 38 ≤ 62 ≤ 73; IQR (CV): 13 (0.1) | 36 distinct values | 0 (0.0%) |
| age [integer] | Mean (sd): 27.4 (8.7); min ≤ med ≤ max: 15 ≤ 26 ≤ 49; IQR (CV): 13 (0.3) | 35 distinct values | 0 (0.0%) |
| electric [integer] | Min: 0; Mean: 0.1; Max: 1 | 0: 3747 (86.0%); 1: 611 (14.0%) | 3 (0.1%) |
| radio [integer] | Min: 0; Mean: 0.7; Max: 1 | 0: 1300 (29.8%); 1: 3059 (70.2%) | 2 (0.0%) |
| tv [integer] | Min: 0; Mean: 0.1; Max: 1 | 0: 3954 (90.7%); 1: 405 (9.3%) | 2 (0.0%) |
| bicycle [integer] | Min: 0; Mean: 0.3; Max: 1 | 0: 3156 (72.4%); 1: 1202 (27.6%) | 3 (0.1%) |
| educ [integer] | Mean (sd): 5.9 (3.9); min ≤ med ≤ max: 0 ≤ 7 ≤ 20; IQR (CV): 5 (0.7) | 21 distinct values | 0 (0.0%) |
| ceb [integer] | Mean (sd): 2.4 (2.4); min ≤ med ≤ max: 0 ≤ 2 ≤ 13; IQR (CV): 3 (1) | 14 distinct values | 0 (0.0%) |
| agefbrth [integer] | Mean (sd): 19 (3.1); min ≤ med ≤ max: 10 ≤ 19 ≤ 38; IQR (CV): 3 (0.2) | 28 distinct values | 1088 (24.9%) |
| children [integer] | Mean (sd): 2.3 (2.2); min ≤ med ≤ max: 0 ≤ 2 ≤ 13; IQR (CV): 4 (1) | 14 distinct values | 0 (0.0%) |
| knowmeth [integer] | Min: 0; Mean: 1; Max: 1 | 0: 160 (3.7%); 1: 4194 (96.3%) | 7 (0.2%) |
| usemeth [integer] | Min: 0; Mean: 0.6; Max: 1 | 0: 1812 (42.2%); 1: 2478 (57.8%) | 71 (1.6%) |
| monthfm [integer] | Mean (sd): 6.3 (3.6); min ≤ med ≤ max: 1 ≤ 6 ≤ 12; IQR (CV): 6 (0.6) | 12 distinct values | 2282 (52.3%) |
| yearfm [integer] | Mean (sd): 76.9 (7.8); min ≤ med ≤ max: 50 ≤ 78 ≤ 88; IQR (CV): 11 (0.1) | 38 distinct values | 2282 (52.3%) |
| agefm [integer] | Mean (sd): 20.7 (5); min ≤ med ≤ max: 10 ≤ 20 ≤ 46; IQR (CV): 6 (0.2) | 35 distinct values | 2282 (52.3%) |
| idlnchld [integer] | Mean (sd): 4.6 (2.2); min ≤ med ≤ max: 0 ≤ 4 ≤ 20; IQR (CV): 3 (0.5) | 19 distinct values | 120 (2.8%) |
| heduc [integer] | Mean (sd): 5.1 (4.8); min ≤ med ≤ max: 0 ≤ 6 ≤ 20; IQR (CV): 8 (0.9) | 21 distinct values | 2405 (55.1%) |
| agesq [integer] | Mean (sd): 826.5 (526.9); min ≤ med ≤ max: 225 ≤ 676 ≤ 2401; IQR (CV): 689 (0.6) | 35 distinct values | 0 (0.0%) |
| urban [integer] | Min: 0; Mean: 0.5; Max: 1 | 0: 2108 (48.3%); 1: 2253 (51.7%) | 0 (0.0%) |
| urb_educ [integer] | Mean (sd): 3.5 (4.3); min ≤ med ≤ max: 0 ≤ 0 ≤ 20; IQR (CV): 7 (1.2) | 21 distinct values | 0 (0.0%) |
| spirit [integer] | Min: 0; Mean: 0.4; Max: 1 | 0: 2520 (57.8%); 1: 1841 (42.2%) | 0 (0.0%) |
| protest [integer] | Min: 0; Mean: 0.2; Max: 1 | 0: 3368 (77.2%); 1: 993 (22.8%) | 0 (0.0%) |
| catholic [integer] | Min: 0; Mean: 0.1; Max: 1 | 0: 3914 (89.8%); 1: 447 (10.2%) | 0 (0.0%) |
| frsthalf [integer] | Min: 0; Mean: 0.5; Max: 1 | 0: 2004 (46.0%); 1: 2357 (54.0%) | 0 (0.0%) |
| educ0 [integer] | Min: 0; Mean: 0.2; Max: 1 | 0: 3455 (79.2%); 1: 906 (20.8%) | 0 (0.0%) |
| evermarr [integer] | Min: 0; Mean: 0.5; Max: 1 | 0: 2282 (52.3%); 1: 2079 (47.7%) | 0 (0.0%) |

Generated by summarytools 1.0.1 (R version 4.2.2)
2022-12-22

Code
glimpse(Womendata)
Rows: 4,361
Columns: 28
$ X        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ mnthborn <int> 5, 1, 7, 11, 5, 8, 7, 9, 12, 9, 6, 10, 12, 2, 1, 6, 1, 8, 4, …
$ yearborn <int> 64, 56, 58, 45, 45, 52, 51, 70, 53, 39, 46, 59, 42, 40, 53, 6…
$ age      <int> 24, 32, 30, 42, 43, 36, 37, 18, 34, 49, 42, 29, 45, 48, 35, 2…
$ electric <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ radio    <int> 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ tv       <int> 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1…
$ bicycle  <int> 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0…
$ educ     <int> 12, 13, 5, 4, 11, 7, 16, 10, 5, 4, 15, 7, 0, 4, 12, 7, 7, 5, …
$ ceb      <int> 0, 3, 1, 3, 2, 1, 4, 0, 1, 0, 3, 3, 4, 10, 3, 0, 4, 2, 0, 1, …
$ agefbrth <int> NA, 25, 27, 17, 24, 26, 20, NA, 19, NA, 25, 23, 18, 19, 23, N…
$ children <int> 0, 3, 1, 2, 2, 1, 4, 0, 1, 0, 3, 3, 2, 8, 3, 0, 4, 2, 0, 1, 0…
$ knowmeth <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ usemeth  <int> 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1…
$ monthfm  <int> NA, 11, 6, 1, 3, 11, 5, NA, 7, 11, 6, 1, 1, 10, 1, NA, NA, NA…
$ yearfm   <int> NA, 80, 83, 61, 66, 76, 78, NA, 72, 61, 70, 84, 66, 66, 74, N…
$ agefm    <int> NA, 24, 24, 15, 20, 24, 26, NA, 18, 22, 24, 24, 23, 26, 21, N…
$ idlnchld <int> 2, 3, 5, 3, 2, 4, 4, 4, 4, 4, 3, 6, 6, 4, 3, 4, 5, 1, 2, 3, 2…
$ heduc    <int> NA, 12, 7, 11, 14, 9, 17, NA, 3, 1, 16, 7, NA, 3, 16, NA, NA,…
$ agesq    <int> 576, 1024, 900, 1764, 1849, 1296, 1369, 324, 1156, 2401, 1764…
$ urban    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ urb_educ <int> 12, 13, 5, 4, 11, 7, 16, 10, 5, 4, 15, 7, 0, 4, 12, 7, 7, 5, …
$ spirit   <int> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0…
$ protest  <int> 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1…
$ catholic <int> 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0…
$ frsthalf <int> 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0…
$ educ0    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
$ evermarr <int> 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1…

The dataset has 28 variables and 4361 observations. The dependent variable of interest is the number of living children. Next, I will tidy the data. The explanatory variables of interest include age, yearborn, mnthborn, and urb_educ, among others that seem intriguing. Variables such as radio, bicycle, and electric can be ignored at this stage.

Code
head(Womendata)
  X mnthborn yearborn age electric radio tv bicycle educ ceb agefbrth children
1 1        5       64  24        1     1  1       1   12   0       NA        0
2 2        1       56  32        1     1  1       1   13   3       25        3
3 3        7       58  30        1     0  0       0    5   1       27        1
4 4       11       45  42        1     0  1       0    4   3       17        2
5 5        5       45  43        1     1  1       1   11   2       24        2
6 6        8       52  36        1     0  0       0    7   1       26        1
  knowmeth usemeth monthfm yearfm agefm idlnchld heduc agesq urban urb_educ
1        1       0      NA     NA    NA        2    NA   576     1       12
2        1       1      11     80    24        3    12  1024     1       13
3        1       0       6     83    24        5     7   900     1        5
4        1       0       1     61    15        3    11  1764     1        4
5        1       1       3     66    20        2    14  1849     1       11
6        1       1      11     76    24        4     9  1296     1        7
  spirit protest catholic frsthalf educ0 evermarr
1      0       0        0        1     0        0
2      0       0        0        1     0        1
3      1       0        0        0     0        1
4      0       0        0        0     0        1
5      0       1        0        1     0        1
6      0       0        0        0     0        1

Tidying the data

Code
Womendata <- data.frame(Womendata)

Womendata$mnthborn <- as.factor(Womendata$mnthborn)
Womendata$age <- as.factor(Womendata$age)
Womendata$electric <- as.factor(Womendata$electric)
Womendata$radio <- as.factor(Womendata$radio)
Womendata$tv <- as.factor(Womendata$tv)
Womendata$bicycle <- as.factor(Womendata$bicycle)
Womendata$educ <- as.factor(Womendata$educ)
Womendata$children <- as.factor(Womendata$children)
Womendata$knowmeth <- as.factor(Womendata$knowmeth)
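The column-by-column conversion above can also be written in one step by listing the columns to convert. The sketch below uses a small made-up data frame `toy` in place of Womendata; the name `indicator_cols` and the choice to convert only the 0/1 indicators (leaving `educ` numeric) are assumptions for the illustration, not the choice made in this analysis.

```r
# Toy data frame standing in for Womendata (values are made up)
toy <- data.frame(
  electric = c(1, 0, 1, NA),
  tv       = c(0, 0, 1, 1),
  educ     = c(12, 5, 7, 0)
)

# Columns to treat as categorical indicators (hypothetical selection)
indicator_cols <- c("electric", "tv")

# Convert all listed columns to factors in one step
toy[indicator_cols] <- lapply(toy[indicator_cols], as.factor)

str(toy)
```

Keeping a variable numeric or turning it into a factor changes how later plots and models treat it, so the list of columns is worth deciding deliberately.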

The dependent variable of interest, either the number of living children (children) or the number of children ever born (ceb), is a count variable.

Code
table(is.na(Womendata))

 FALSE   TRUE 
111561  10547 

The dataset has some missing values (10547 cells). To check whether they matter, we can plot them with plot_missing from the DataExplorer package.

Code
plot_missing(data = Womendata, geom_label_args = list(size = 1.4), theme_config=list(axis.text=element_text(size = 6 )))
Error in plot_missing(data = Womendata, geom_label_args = list(size = 1.4), : could not find function "plot_missing"

Most of the missing values are concentrated in a handful of variables (agefbrth, monthfm, yearfm, agefm, and heduc) that convey less important information for this analysis than the other variables. Hence we can set these values aside.
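Because plot_missing was not available in this render, the same information can be obtained with base R alone. The sketch below counts missing values per column on a small made-up data frame `toy`; the column names mirror Womendata but the values are invented for the illustration.

```r
# Toy data frame standing in for Womendata (values are made up)
toy <- data.frame(
  age      = c(24, 32, 30, 42),
  agefbrth = c(NA, 25, 27, NA),
  heduc    = c(NA, 12, NA, NA)
)

# Number and percentage of missing values in each column
na_count <- colSums(is.na(toy))
na_share <- round(100 * colMeans(is.na(toy)), 1)

missing_summary <- data.frame(n_missing = na_count, pct_missing = na_share)

# Sort so the columns with the most missing values come first
missing_summary <- missing_summary[order(-missing_summary$n_missing), ]
print(missing_summary)
```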

Removing missing values and parameters that are not required

The aim is to estimate the effect of education on women's fertility flexibly, while controlling for possible linear and non-linear effects of observable and unobservable confounding factors, and to analyse how the effect of education changes across different expectiles of the response variable's distribution. We also include two variables on the knowledge and use of birth control methods, as well as marital status. All three are expected to influence the number of children. Further, we include variables indicating wealth, e.g. the availability of electricity, a television set, or a bicycle.

We exclude variables that are of little importance for the analysis from our data.

Code
Womendata <- subset(Womendata, select = -c(agefm,yearfm,monthfm,heduc))
Code
Womendatacleaned <-Womendata[complete.cases(Womendata), ]
plot_missing(data = Womendatacleaned, geom_label_args = list(size = 1.4), theme_config=list(axis.text=element_text(size = 6 )))
Error in plot_missing(data = Womendatacleaned, geom_label_args = list(size = 1.4), : could not find function "plot_missing"
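Listwise deletion with complete.cases removes a row if any remaining column is missing, and in the summary above agefbrth alone has 1088 missing values. It may therefore be worth checking how many rows survive. The sketch below compares deletion over all columns with deletion restricted to a hypothetical set of model columns (`model_cols`), using a made-up data frame `toy`.

```r
# Toy data frame standing in for Womendata (values are made up)
toy <- data.frame(
  children = c(0, 3, 1, 2, 0),
  educ     = c(12, 13, 5, 4, 10),
  agefbrth = c(NA, 25, 27, 17, NA),  # missing in some rows
  usemeth  = c(0, 1, NA, 0, 1)
)

# Listwise deletion over every column
all_cols <- toy[complete.cases(toy), ]

# Listwise deletion restricted to the columns a model would use
model_cols <- c("children", "educ", "usemeth")
used_cols  <- toy[complete.cases(toy[model_cols]), ]

cat("rows kept (all columns):  ", nrow(all_cols), "\n")
cat("rows kept (model columns):", nrow(used_cols), "\n")
```

If the two counts differ a lot, the rows lost to an unused column may not be a random subset, which is a reason to restrict the check to the variables that actually enter the model.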

Detecting outliers.

Code
Womendata %>%
  ggplot(aes(educ)) +
  geom_boxplot() +
  coord_flip()

The box plot shows that educ (years of education) has some outliers, mostly women with more than 15 years of education, but these are unlikely to distort the analysis.
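The points a box plot draws as outliers can also be listed numerically with `boxplot.stats()`, which by default flags values more than 1.5 times the hinge spread beyond the hinges. The vector `educ_years` below is made up for the illustration and is not taken from the data.

```r
# Made-up years of education (hypothetical values)
educ_years <- c(0, 3, 5, 7, 7, 7, 8, 8, 10, 12, 20)

# boxplot.stats() uses coef = 1.5 by default
stats <- boxplot.stats(educ_years)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # values that a box plot would draw as individual points

# Share of observations flagged as outliers
outlier_share <- length(stats$out) / length(educ_years)
print(outlier_share)
```

Listing the flagged values makes it easier to judge whether they are plausible observations or data errors.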

Code
Womendata %>%
  ggplot(aes(age)) +
  geom_boxplot() +
  coord_flip()

From the box plot above, the age variable has no outliers.

Code
Womendata %>%
  ggplot(aes(children)) +
  geom_boxplot() +
  coord_flip()

From the above plot we can see that the variable children does have outliers but nothing to be concerned about.

Code
Womendata %>%
  ggplot(aes(urb_educ)) +
  geom_boxplot() +
  coord_flip()

From the above plot we can see that the variable urban education does have outliers but nothing to be concerned about.

Code
Womendata %>%
  ggplot(aes(yearborn)) +
  geom_boxplot() +
  coord_flip()

From the box plot above, the yearborn variable has no outliers.

Code
Womendata %>%
  ggplot(aes(mnthborn)) +
  geom_boxplot() +
  coord_flip()

From the box plot above, the mnthborn variable has no outliers.

Exploratory Data Analysis (EDA)

Variables indicating wealth, e.g. the availability of electricity, a television set, or a bicycle, may say something about a woman's knowledge and use of birth control and, in turn, about how many children she has.

Code
# Stacked bar chart: number of children, split by TV ownership
ggplot(Womendatacleaned, aes(x = factor(children), fill = factor(tv))) +
  geom_bar() +
  theme(
    axis.text.x = element_text(face = "bold", size = 15),
    axis.text.y = element_text(face = "bold", size = 15)
  ) +
  labs(
    title = "Number of children by whether they own a TV or not",
    x = "Number of children",
    y = "Count"
  ) +
  scale_fill_manual(
    name = "Access to television or not",
    breaks = c("0", "1"),
    labels = c("No television", "Owns/has access to a television"),
    values = c("0" = "orange", "1" = "yellow")
  )

We can see that most mothers do not own a TV here.

Code
# Stacked bar chart: number of children, split by radio ownership
ggplot(Womendatacleaned, aes(x = factor(children), fill = factor(radio))) +
  geom_bar() +
  theme(
    axis.text.x = element_text(face = "bold", size = 15),
    axis.text.y = element_text(face = "bold", size = 15)
  ) +
  labs(
    title = "Number of children by whether they own a radio or not",
    x = "Number of children",
    y = "Count"
  ) +
  scale_fill_manual(
    name = "Access to radio or not",
    breaks = c("0", "1"),
    labels = c("No radio", "Owns/has access to a radio"),
    values = c("0" = "green", "1" = "pink")
  )

Code
# Stacked bar chart: number of children, split by access to electricity
ggplot(Womendatacleaned, aes(x = factor(children), fill = factor(electric))) +
  geom_bar() +
  theme(
    axis.text.x = element_text(face = "bold", size = 15),
    axis.text.y = element_text(face = "bold", size = 15)
  ) +
  labs(
    title = "Number of children by whether they have electricity or not",
    x = "Number of children",
    y = "Count"
  ) +
  scale_fill_manual(
    name = "Access to electricity or not",
    breaks = c("0", "1"),
    labels = c("No electricity", "Has access to electricity"),
    values = c("0" = "pink", "1" = "yellow")
  )

The bar graph shows that most mothers do not have electricity.
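A statement such as "most mothers do not have electricity" can be backed with numbers by tabulating the shares directly. The sketch below does this on a made-up data frame `toy`; the column names follow Womendata but the values are invented for the illustration.

```r
# Toy data frame standing in for Womendatacleaned (values are made up)
toy <- data.frame(
  children = c(0, 0, 1, 1, 2, 2, 3, 3),
  electric = c(0, 1, 0, 0, 0, 0, 0, 1)
)

# Cross-tabulate number of children against access to electricity
counts <- table(children = toy$children, electric = toy$electric)

# Share with and without electricity within each number of children
row_share <- prop.table(counts, margin = 1)
print(round(row_share, 2))

# Overall share of women without electricity
overall_without <- mean(toy$electric == 0)
print(overall_without)
```

Row shares are easier to compare across groups than the raw bar heights, because the groups differ a lot in size.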

Code
# Bar chart of the usemeth indicator (1 = has ever used birth control)
p <- Womendata %>%
  ggplot(aes(x = usemeth)) +
  geom_bar() +
  labs(
    title = "Individuals that have ever used birth control",
    x = "Has ever used birth control (0 = no, 1 = yes)",
    y = "Count"
  ) +
  theme_classic()
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ legend.title.align        : NULL
 $ legend.position           : chr "right"
 $ legend.direction          : NULL
 $ legend.justification      : chr "center"
 $ legend.box                : NULL
 $ legend.box.just           : NULL
 $ legend.box.margin         : 'margin' num [1:4] 0cm 0cm 0cm 0cm
  ..- attr(*, "unit")= int 1
 $ legend.box.background     : list()
  ..- attr(*, "class")= chr [1:2] "element_blank" "element"
 $ legend.box.spacing        : 'simpleUnit' num 11points
  ..- attr(*, "unit")= int 8
 $ panel.background          :List of 5
  ..$ fill         : chr "white"
  ..$ colour       : logi NA
  ..$ linewidth    : NULL
  ..$ linetype     : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_rect" "element"
 $ panel.border              : list()
  ..- attr(*, "class")= chr [1:2] "element_blank" "element"
 $ panel.spacing             : 'simpleUnit' num 5.5points
  ..- attr(*, "unit")= int 8
 $ panel.spacing.x           : NULL
 $ panel.spacing.y           : NULL
 $ panel.grid                :List of 6
  ..$ colour       : chr "grey92"
  ..$ linewidth    : NULL
  ..$ linetype     : NULL
  ..$ lineend      : NULL
  ..$ arrow        : logi FALSE
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_line" "element"
 $ panel.grid.major          : list()
  ..- attr(*, "class")= chr [1:2] "element_blank" "element"
 $ panel.grid.minor          : list()
  ..- attr(*, "class")= chr [1:2] "element_blank" "element"
 $ panel.grid.major.x        : NULL
 $ panel.grid.major.y        : NULL
 $ panel.grid.minor.x        : NULL
 $ panel.grid.minor.y        : NULL
 $ panel.ontop               : logi FALSE
 $ plot.background           :List of 5
  ..$ fill         : NULL
  ..$ colour       : chr "white"
  ..$ linewidth    : NULL
  ..$ linetype     : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_rect" "element"
 $ plot.title                :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : 'rel' num 1.2
  ..$ hjust        : num 0
  ..$ vjust        : num 1
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : 'margin' num [1:4] 0points 0points 5.5points 0points
  .. ..- attr(*, "unit")= int 8
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ plot.title.position       : chr "panel"
 $ plot.subtitle             :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : NULL
  ..$ hjust        : num 0
  ..$ vjust        : num 1
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : 'margin' num [1:4] 0points 0points 5.5points 0points
  .. ..- attr(*, "unit")= int 8
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ plot.caption              :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : 'rel' num 0.8
  ..$ hjust        : num 1
  ..$ vjust        : num 1
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : 'margin' num [1:4] 5.5points 0points 0points 0points
  .. ..- attr(*, "unit")= int 8
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ plot.caption.position     : chr "panel"
 $ plot.tag                  :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : 'rel' num 1.2
  ..$ hjust        : num 0.5
  ..$ vjust        : num 0.5
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ plot.tag.position         : chr "topleft"
 $ plot.margin               : 'margin' num [1:4] 5.5points 5.5points 5.5points 5.5points
  ..- attr(*, "unit")= int 8
 $ strip.background          :List of 5
  ..$ fill         : chr "white"
  ..$ colour       : chr "black"
  ..$ linewidth    : 'rel' num 2
  ..$ linetype     : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_rect" "element"
 $ strip.background.x        : NULL
 $ strip.background.y        : NULL
 $ strip.clip                : chr "inherit"
 $ strip.placement           : chr "inside"
 $ strip.text                :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : chr "grey10"
  ..$ size         : 'rel' num 0.8
  ..$ hjust        : NULL
  ..$ vjust        : NULL
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : 'margin' num [1:4] 4.4points 4.4points 4.4points 4.4points
  .. ..- attr(*, "unit")= int 8
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ strip.text.x              : NULL
 $ strip.text.y              :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : NULL
  ..$ hjust        : NULL
  ..$ vjust        : NULL
  ..$ angle        : num -90
  ..$ lineheight   : NULL
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 $ strip.switch.pad.grid     : 'simpleUnit' num 2.75points
  ..- attr(*, "unit")= int 8
 $ strip.switch.pad.wrap     : 'simpleUnit' num 2.75points
  ..- attr(*, "unit")= int 8
 $ strip.text.y.left         :List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : NULL
  ..$ hjust        : NULL
  ..$ vjust        : NULL
  ..$ angle        : num 90
  ..$ lineheight   : NULL
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi TRUE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 - attr(*, "class")= chr [1:2] "theme" "gg"
 - attr(*, "complete")= logi TRUE
 - attr(*, "validate")= logi TRUE
Code
p
Warning: Use of `Womendata$usemeth` is discouraged.
ℹ Use `usemeth` instead.
Warning: Removed 71 rows containing non-finite values (`stat_count()`).

The majority of women have used birth control at least once in their lives.

Code
k <- Womendata %>%
  ggplot() +
  # refer to the column directly inside aes() rather than through Womendata$
  geom_bar(aes(knowmeth)) +
  labs(title = "Individual knows about birth control",
       x = "Knows about birth control (0 = no, 1 = yes)",
       y = "Count") +
  # the `+` keeps theme_classic() in the plot chain instead of printing the theme
  theme_classic()
Code
k

Here, we can see that most individuals know about birth control: 4,194 women report knowing about it and only 160 do not.

Code
ggplot(data = Womendata,
       aes(
         x = children,
         y = prop.table(after_stat(count)),
         # usemeth is coded 0 = never used, 1 = has used birth control
         fill = factor(usemeth), width = 1,
         label = scales::percent(prop.table(after_stat(count)))
       )) +
  geom_bar(position = position_dodge()) +
  geom_text(
    stat = "count",
    position = position_dodge(0.2),
    vjust = -1,
    size = 1.5
  ) + scale_y_continuous(labels = scales::percent) +
  labs(title = "Number of children based on birth control",
       x = "Number of Children",
       y = "Percentage of women") +
  theme_classic() +
  scale_fill_discrete(
    name = "Birth Control",
    # labels follow the order of the factor levels "0", "1"
    labels = c("Never used birth control", "Has used birth control")
  )

With usemeth coded 1 for women who have used birth control, the plot shows that women who have never used birth control outnumber those who have only among women with zero children. As the number of children goes up, women who have used birth control outnumber those who have never used it. This suggests that birth control use is more common among women who already have children than among women who have none.
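A reading like this can be checked numerically by tabulating the shares behind the bars. The following is a minimal base R sketch on a small made-up data frame (the `toy` values are hypothetical and are not the Botswana data); only the column names `children` and `usemeth` are borrowed from the analysis.

```r
# Hypothetical toy data; usemeth is 1 if the woman has ever used birth control
toy <- data.frame(
  children = c(0, 0, 0, 1, 1, 2, 2, 2),
  usemeth  = c(0, 0, 1, 1, 1, 1, 0, 1)
)

# Counts of women by number of children (rows) and birth control use (columns)
counts <- table(children = toy$children, usemeth = toy$usemeth)

# Row-wise shares: within each number of children, the fraction in each usemeth group
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Row-wise shares compare the two groups within each family size, which avoids being misled by how many women fall into each family size overall.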

Code
ggplot(Womendatacleaned, aes(x = factor(children), fill = factor(evermarr))) +
  geom_bar() +
  theme(axis.text.x = element_text(face = "bold", size = 15),
        axis.text.y = element_text(face = "bold", size = 15)) +
  labs(
    title = "Number of Children based on Marriage status",
    x = "Number of children",
    y = "Count") +
  scale_fill_manual(
    name = "Married or not",
    breaks = c("0", "1"),
    labels = c("Not Married", "Married"),
    values = c("0" = "blue", "1" = "red")
  )

The most common number of children is one, and the majority of those mothers are not married. This is a feature of our data worth keeping in mind.

Code
Womendata %>%
  ggplot() +
  geom_bar(aes(educ)) +
  theme_classic() +
  # the bars count individuals per year of schooling, so the title says so
  labs(title = "Number of individuals by years of schooling",
       x = "Number of years of schooling",
       y = "Count")

From the bar graph we can say that the most common level of education in our data is 7 years of schooling (1,162 women), and the next most common is 0 years (906 women). This uneven distribution of schooling is worth keeping in mind when we look at the relationship between education and fertility.

Code
ggplot(Womendatacleaned, aes(x = factor(age), fill = factor(usemeth))) +
  geom_bar() +
  theme(axis.text.x = element_text(face = "bold", size = 5),
        axis.text.y = element_text(face = "bold", size = 15)) +
  labs(
    title = "Number of individuals that have used birth control by age",
    x = "Age",
    y = "Count") +
  scale_fill_manual(
    name = "Use birth control or not",
    breaks = c("0", "1"),
    labels = c("Never used birth control", "Has used birth control"),
    values = c("0" = "brown", "1" = "green")
  )

Code
ggplot(data = Womendata, aes(x=mnthborn, y= children)) + 
  geom_boxplot(outlier.color = "red", outlier.shape = 1, show.legend = TRUE) + 
  facet_wrap(~mnthborn)

Code
ggplot(data = Womendata) + 
  geom_violin(mapping = aes(y=children, x = educ,fill=educ), trim = TRUE, draw_quantiles = c(0.25, 0.5, 0.75))
Warning: Groups with fewer than two data points have been dropped.

Code
# plot_bar() comes from the DataExplorer package, which must be installed
DataExplorer::plot_bar(data = Womendata)
Code
Womendata$educ <-as.integer(as.character(Womendata$educ))
Womendata$age <- as.integer(as.character(Womendata$age))

Womendata$mnthborn <- as.integer(as.character(Womendata$mnthborn))
Womendata$electric <- as.integer(as.character(Womendata$electric))

Womendata$radio <- as.integer(as.character(Womendata$radio))
Womendata$tv <- as.integer(as.character(Womendata$tv))

Womendata$bicycle <- as.integer(as.character(Womendata$bicycle))
Womendata$children <- as.integer(as.character(Womendata$children))

Womendata$knowmeth <- as.integer(as.character(Womendata$knowmeth))
# plot_histogram() comes from the DataExplorer package, which must be installed
DataExplorer::plot_histogram(data = Womendata)

Correlation

Code
library(corrplot)
corrplot 0.92 loaded
Code
library(RColorBrewer)

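# Use pairwise complete observations so columns with NAs (e.g. agefbrth) still get a correlation
M <-cor(Womendata %>% 
         dplyr::select(age, yearborn, educ, ceb, agefbrth, children, usemeth, knowmeth),
        use = "pairwise.complete.obs")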

corrplot(M, type="upper", order = "original",col=brewer.pal(n=8, name="RdYlBu"))
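Several of the selected columns contain missing values according to the summary of the data (for example `agefbrth` and `usemeth`), and with its default settings `cor()` returns NA for any pair that involves a missing value. The sketch below uses a made-up data frame (`toy` is hypothetical, not the survey data) to show what the `use` argument changes.

```r
# Hypothetical toy data with one missing value in y
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, NA, 8, 10),
  z = c(5, 4, 3, 2, 1)
)

# Default behaviour: any pair involving y is NA
default_cor <- cor(toy)

# Pairwise deletion: each pair uses the rows where both variables are observed
pairwise_cor <- cor(toy, use = "pairwise.complete.obs")

print(default_cor)
print(pairwise_cor)
```

Pairwise deletion means different cells of the matrix can be based on different numbers of observations, so it is a convenience for exploration rather than a substitute for handling the missing values.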

Code
cor(Womendata$educ, Womendata$children)
[1] -0.3705226

Here we can see that education and number of children have a negative correlation (r ≈ -0.37). A negative correlation is a relationship between two variables in which one variable tends to decrease as the other increases, and vice versa.
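As a small illustration of what the sign of a correlation means, the sketch below uses made-up numbers (the vectors `years_school` and `n_children` are hypothetical, not taken from the survey). One vector rises while the other falls, so Pearson's r comes out negative.

```r
# Hypothetical values chosen only to illustrate the sign of a correlation
years_school <- c(0, 2, 4, 6, 8, 10)
n_children   <- c(6, 5, 5, 3, 2, 1)

# cor() computes the Pearson correlation by default
r <- cor(years_school, n_children)
print(r)

# Reversing one variable flips the sign but leaves the strength unchanged
r_flipped <- cor(years_school, -n_children)
print(r_flipped)
```

The sign describes direction only; a correlation on its own does not show that more schooling causes fewer children.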

Code
cor(Womendata$educ, Womendata$ceb)
[1] -0.3842877

Here we can see that education and number of children ever born have a negative correlation as well (r ≈ -0.38). Both measures of fertility therefore move in the opposite direction to years of education.

Code
cor(Womendata$age, unclass(Womendata$educ))
[1] -0.3096017

We can see that age and education have a negative correlation (r ≈ -0.31): older women in the sample tend to have fewer years of schooling.

Code
cor(Womendata$age, Womendata$children)
[1] 0.7325709

Age and number of children have a strong positive correlation (r ≈ 0.73): older women tend to have more children.

Code
# install.packages("PerformanceAnalytics")  # run once if the package is not installed
library(PerformanceAnalytics)
Code
chart.Correlation(Womendata %>% 
              dplyr::select(age,yearborn, educ, ceb, agefbrth, children), histogram=TRUE, pch=19)
Code
Womendata$educ <- as.factor(Womendata$educ)
Womendata$age <- as.factor(Womendata$age)

Womendata$mnthborn <- as.factor(Womendata$mnthborn)
Womendata$electric <- as.factor(Womendata$electric)

Womendata$radio <- as.factor(Womendata$radio)
Womendata$tv <- as.factor(Womendata$tv)

Womendata$bicycle <- as.factor(Womendata$bicycle)
Womendata$children <- as.factor(Womendata$children)

Womendata$knowmeth <- as.factor(Womendata$knowmeth)
Womendata$catholic <- as.factor(Womendata$catholic)

Womendata$frsthalf <- as.factor(Womendata$frsthalf)
Womendata$educ0 <- as.factor(Womendata$educ0)

Womendata$evermarr <- as.factor(Womendata$evermarr)
Womendata$protest <- as.factor(Womendata$protest)
Womendata$spirit <- as.factor(Womendata$spirit)

Womendata$urban <- as.factor(Womendata$urban)
Code
summary(Womendata)
       X           mnthborn       yearborn          age       electric   
 Min.   :   1   6      : 623   Min.   :38.00   18     : 243   0   :3747  
 1st Qu.:1091   3      : 406   1st Qu.:55.00   20     : 219   1   : 611  
 Median :2181   9      : 382   Median :62.00   22     : 206   NA's:   3  
 Mean   :2181   1      : 380   Mean   :60.43   16     : 205              
 3rd Qu.:3271   8      : 363   3rd Qu.:68.00   19     : 205              
 Max.   :4361   7      : 358   Max.   :73.00   26     : 201              
                (Other):1849                   (Other):3082              
  radio         tv       bicycle          educ           ceb        
 0   :1300   0   :3954   0   :3156   7      :1162   Min.   : 0.000  
 1   :3059   1   : 405   1   :1202   0      : 906   1st Qu.: 1.000  
 NA's:   2   NA's:   2   NA's:   3   10     : 527   Median : 2.000  
                                     6      : 298   Mean   : 2.442  
                                     5      : 234   3rd Qu.: 4.000  
                                     9      : 232   Max.   :13.000  
                                     (Other):1002                   
    agefbrth        children    knowmeth       usemeth          idlnchld     
 Min.   :10.00   0      :1132   0   : 160   Min.   :0.0000   Min.   : 0.000  
 1st Qu.:17.00   1      : 907   1   :4194   1st Qu.:0.0000   1st Qu.: 3.000  
 Median :19.00   2      : 696   NA's:   7   Median :1.0000   Median : 4.000  
 Mean   :19.01   3      : 528               Mean   :0.5776   Mean   : 4.616  
 3rd Qu.:20.00   4      : 392               3rd Qu.:1.0000   3rd Qu.: 6.000  
 Max.   :38.00   5      : 255               Max.   :1.0000   Max.   :20.000  
 NA's   :1088    (Other): 451               NA's   :71       NA's   :120     
     agesq        urban       urb_educ      spirit   protest  catholic frsthalf
 Min.   : 225.0   0:2108   Min.   : 0.000   0:2520   0:3368   0:3914   0:2004  
 1st Qu.: 400.0   1:2253   1st Qu.: 0.000   1:1841   1: 993   1: 447   1:2357  
 Median : 676.0            Median : 0.000                                      
 Mean   : 826.5            Mean   : 3.469                                      
 3rd Qu.:1089.0            3rd Qu.: 7.000                                      
 Max.   :2401.0            Max.   :20.000                                      
                                                                               
 educ0    evermarr
 0:3455   0:2282  
 1: 906   1:2079  
                  
                  
                  
                  
                  
Code
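# unclass() on a factor returns its integer level codes (1, 2, ...), not the
# original values, so educ = 0 becomes 1 and children = 0 becomes 1.
Womendata$age <- unclass(Womendata$age)
Womendata$children <- unclass(Womendata$children)
Womendata$educ <- unclass(Womendata$educ)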
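The conversion above turns factors back into numbers through unclass(), which yields level codes rather than the values the factor was built from. The sketch below, using a small hypothetical vector `educ_raw`, shows the difference and one common way to recover the original values.

```r
educ_raw <- c(0, 7, 10, 7, 0, 12)  # hypothetical years of schooling
educ_fac <- as.factor(educ_raw)    # levels are "0", "7", "10", "12"

codes  <- unclass(educ_fac)                   # integer level codes
values <- as.numeric(as.character(educ_fac))  # original numeric values

print(as.integer(codes))  # 1 2 3 2 1 4
print(values)             # 0 7 10 7 0 12
```

When every integer between the minimum and maximum occurs in the data, the codes are just the values shifted by a constant, which changes the intercept of a regression but not the slope. If some values are missing from the data, the spacing between codes no longer matches the spacing between values, and slopes are affected as well.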

Regression Models:

Code
model1 <- lm(children ~ educ, data = Womendata)
summary(model1)

Call:
lm(formula = children ~ educ, data = Womendata)

Residuals:
   Min     1Q Median     3Q    Max 
-3.495 -1.496 -0.399  1.182  9.505 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.70519    0.06289   74.81   <2e-16 ***
educ        -0.20965    0.00796  -26.34   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.064 on 4359 degrees of freedom
Multiple R-squared:  0.1373,    Adjusted R-squared:  0.1371 
F-statistic: 693.7 on 1 and 4359 DF,  p-value: < 2.2e-16
Code
model2 <- lm(children ~ ., data = Womendata)
summary(model2)
Code
model3 <- lm(children ~ educ + age + mnthborn + bicycle + urb_educ + evermarr +
               yearborn + radio + agefbrth + idlnchld + ceb, data = Womendata)
summary(model3)

Call:
lm(formula = children ~ educ + age + mnthborn + bicycle + urb_educ + 
    evermarr + yearborn + radio + agefbrth + idlnchld + ceb, 
    data = Womendata)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.0221 -0.0315  0.0745  0.2482  1.2733 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.0207133  3.1695672  -0.322   0.7474    
educ        -0.0001782  0.0032429  -0.055   0.9562    
age          0.0248719  0.0427747   0.581   0.5610    
mnthborn2   -0.0238081  0.0456270  -0.522   0.6018    
mnthborn3    0.0671309  0.0426656   1.573   0.1157    
mnthborn4    0.0529723  0.0451366   1.174   0.2406    
mnthborn5    0.0374369  0.0462253   0.810   0.4181    
mnthborn6    0.0572035  0.0388489   1.472   0.1410    
mnthborn7    0.0378128  0.0439550   0.860   0.3897    
mnthborn8    0.0419064  0.0450599   0.930   0.3524    
mnthborn9    0.0501796  0.0453065   1.108   0.2681    
mnthborn10   0.0197417  0.0503030   0.392   0.6947    
mnthborn11   0.0135532  0.0561693   0.241   0.8093    
mnthborn12   0.0537444  0.0613665   0.876   0.3812    
bicycle1     0.0488972  0.0211513   2.312   0.0209 *  
urb_educ     0.0019654  0.0028657   0.686   0.4929    
evermarr1    0.0428377  0.0208819   2.051   0.0403 *  
yearborn     0.0269700  0.0428099   0.630   0.5287    
radio1       0.0220766  0.0212752   1.038   0.2995    
agefbrth     0.0054808  0.0035900   1.527   0.1269    
idlnchld    -0.0083450  0.0044706  -1.867   0.0620 .  
ceb          0.9004369  0.0067066 134.262   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5175 on 3155 degrees of freedom
  (1184 observations deleted due to missingness)
Multiple R-squared:  0.9368,    Adjusted R-squared:  0.9363 
F-statistic:  2226 on 21 and 3155 DF,  p-value: < 2.2e-16

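With an adjusted R-squared of 0.9363, model 3 fits the data best. Almost all of that fit comes from ceb (children ever born), which measures nearly the same quantity as the response, children. Once ceb is included, educ is no longer significant (p = 0.9562).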

Code
library(MASS)

Attaching package: 'MASS'
The following object is masked from 'package:dplyr':

    select
Code
# Fit the full log-linear model on complete cases only.
model4 <- lm(log(ceb) ~ ., data = na.omit(Womendata))

# Stepwise AIC search (both directions) to simplify the model
# without losing much predictive performance.
step.model <- stepAIC(model4, direction = "both", trace = FALSE)

summary(step.model)

Call:
lm(formula = log(ceb) ~ yearborn + age + electric + radio + educ + 
    agefbrth + children + usemeth + agesq + urban + evermarr, 
    data = na.omit(Womendata))

Residuals:
     Min       1Q   Median       3Q      Max 
-1.11133 -0.14413  0.01042  0.13234  1.31158 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.109e+00  8.387e-01   2.515  0.01197 *  
yearborn    -2.948e-02  1.135e-02  -2.597  0.00946 ** 
age          7.227e-02  1.212e-02   5.962 2.77e-09 ***
electric1   -3.135e-02  1.330e-02  -2.357  0.01851 *  
radio1      -1.997e-02  9.531e-03  -2.096  0.03620 *  
educ        -3.248e-03  1.270e-03  -2.558  0.01059 *  
agefbrth    -2.072e-02  1.604e-03 -12.913  < 2e-16 ***
children     2.550e-01  3.157e-03  80.773  < 2e-16 ***
usemeth      4.749e-02  9.875e-03   4.809 1.59e-06 ***
agesq       -1.312e-03  6.351e-05 -20.657  < 2e-16 ***
urban1      -2.796e-02  9.072e-03  -3.083  0.00207 ** 
evermarr1    5.743e-02  9.546e-03   6.016 2.00e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2328 on 3106 degrees of freedom
Multiple R-squared:  0.8895,    Adjusted R-squared:  0.8891 
F-statistic:  2272 on 11 and 3106 DF,  p-value: < 2.2e-16
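The stepwise search and the log-transformed response above can be illustrated on a small scale. The sketch below uses simulated data (the variables `x1`, `x2`, `x3` and `y` are hypothetical) and base R's step(), which performs an AIC-based search similar to MASS::stepAIC(). It also converts a coefficient from a log-response model into an approximate percent change in the response.

```r
set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- exp(1 - 0.3 * x1 + rnorm(n, sd = 0.2))  # positive response, so log(y) is defined
sim <- data.frame(y, x1, x2, x3)

full <- lm(log(y) ~ ., data = sim)
sel  <- step(full, direction = "both", trace = 0)  # AIC-based search in both directions

# In a log-response model, a one-unit increase in x1 multiplies y by exp(b),
# i.e. changes it by 100 * (exp(b) - 1) percent.
b_x1 <- coef(sel)[["x1"]]
pct  <- 100 * (exp(b_x1) - 1)

print(formula(sel))
print(round(pct, 1))
```

Applied to the output above, the educ coefficient of -3.248e-03 corresponds to roughly a 0.3 percent decrease in ceb per one-unit increase in educ, holding the other predictors fixed.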

Interpreting the Results of Linear Regression (Ordinary Least Squares, OLS)

Code
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()
ols <- lm(formula(model3), data = Womendata)
coeftest(ols, vcov. = vcovHC, type = "HC0")

t test of coefficients:

               Estimate  Std. Error t value Pr(>|t|)    
(Intercept) -1.02071335  3.07872730 -0.3315  0.74026    
educ        -0.00017819  0.00298008 -0.0598  0.95232    
age          0.02487191  0.04161323  0.5977  0.55009    
mnthborn2   -0.02380807  0.05235867 -0.4547  0.64935    
mnthborn3    0.06713094  0.04411010  1.5219  0.12814    
mnthborn4    0.05297229  0.05018473  1.0555  0.29126    
mnthborn5    0.03743693  0.05071823  0.7381  0.46049    
mnthborn6    0.05720345  0.04422397  1.2935  0.19593    
mnthborn7    0.03781281  0.04615904  0.8192  0.41274    
mnthborn8    0.04190639  0.04450873  0.9415  0.34650    
mnthborn9    0.05017962  0.04424976  1.1340  0.25688    
mnthborn10   0.01974173  0.05377659  0.3671  0.71356    
mnthborn11   0.01355316  0.05443757  0.2490  0.80340    
mnthborn12   0.05374445  0.06219766  0.8641  0.38760    
bicycle1     0.04889722  0.01999543  2.4454  0.01452 *  
urb_educ     0.00196540  0.00225881  0.8701  0.38431    
evermarr1    0.04283771  0.02093188  2.0465  0.04079 *  
yearborn     0.02696998  0.04159905  0.6483  0.51682    
radio1       0.02207657  0.02200372  1.0033  0.31579    
agefbrth     0.00548083  0.00370002  1.4813  0.13863    
idlnchld    -0.00834501  0.00567801 -1.4697  0.14174    
ceb          0.90043685  0.00948091 94.9737  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
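The coefficient estimates in this table are the same as in the ordinary summary of model3; only the standard errors differ, because HC0 replaces the usual variance formula with a heteroskedasticity-robust "sandwich" estimator. The sketch below reproduces that calculation by hand in base R on simulated data (the variables `x` and `y` are hypothetical), as an illustration of what the HC0 option computes rather than a replacement for the sandwich package.

```r
set.seed(7)
n <- 500
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.2 + 0.3 * x)  # error spread grows with x
fit <- lm(y ~ x)

X <- model.matrix(fit)
e <- residuals(fit)

bread    <- solve(crossprod(X))  # (X'X)^-1
meat     <- crossprod(X * e)     # X' diag(e^2) X
vcov_hc0 <- bread %*% meat %*% bread

se_hc0 <- sqrt(diag(vcov_hc0))   # robust (HC0) standard errors
se_ols <- sqrt(diag(vcov(fit)))  # classical standard errors

print(cbind(estimate = coef(fit), se_ols, se_hc0))
```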

Model Evaluation

Code
par(mfrow=c(2,2)) 

plot(model1)

Code
par(mfrow=c(2,2)) 

plot(model2)

Code
par(mfrow=c(2,2)) 

plot(model3)

Code
par(mfrow=c(2,2)) 

plot(model4)

For the stepwise model, R^2 = 0.8895 and adjusted R^2 = 0.8891; the F statistic is 2272 on 11 and 3106 degrees of freedom, with p-value < 2.2e-16. The F test assumes normally distributed errors. By the central limit theorem, the sampling distribution of a mean approaches a normal distribution quickly as n increases, where n is the size of each sample and not the number of samples, so with more than 3,000 observations the normal approximation is reasonable. The F statistic is a ratio of two variance estimates: the variance explained by the model per degree of freedom, divided by the residual variance. If the null hypothesis is true, this ratio should be close to 1. A ratio far above 1, as here, leads us to reject the null hypothesis that all slope coefficients are zero.
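The link between R^2 and the overall F statistic can be checked directly. The sketch below defines a small helper, `f_from_r2` (a hypothetical name), applies it to the rounded values reported for the stepwise model, and then verifies the identity against summary() on a simulated fit.

```r
# Overall F statistic written in terms of R^2 and the two degrees of freedom.
f_from_r2 <- function(r2, df_model, df_resid) {
  (r2 / df_model) / ((1 - r2) / df_resid)
}

# Rounded values reported for the stepwise model.
f_reported <- f_from_r2(0.8895, 11, 3106)
print(f_reported)

# Check the identity on a simulated regression.
set.seed(3)
sim <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
sim$y <- 1 + 0.5 * sim$x1 + rnorm(100)
s <- summary(lm(y ~ x1 + x2, data = sim))

f_check <- f_from_r2(s$r.squared, s$fstatistic[["numdf"]], s$fstatistic[["dendf"]])
p_value <- pf(s$fstatistic[["value"]], s$fstatistic[["numdf"]],
              s$fstatistic[["dendf"]], lower.tail = FALSE)
print(c(F = f_check, p = p_value))
```

The value computed from the rounded R^2 is close to the reported 2272; the small difference comes from rounding R^2 to four decimals.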

Checking for Heteroskedasticity

Breusch-Pagan Test for Heteroskedasticity

Ho: the variance is constant
H1: the variance is not constant
Code
bptest(ceb ~ ., data = Womendatacleaned)

    studentized Breusch-Pagan test

data:  ceb ~ .
BP = 319.48, df = 94, p-value < 2.2e-16

Since the p-value (< 2.2e-16) is far below 0.05, Ho is rejected: the variance is not constant, that is, the errors are heteroskedastic.
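The studentized Breusch-Pagan statistic can be computed by hand, which makes clear what the test measures: how well the predictors explain the squared residuals. The sketch below uses simulated data (the variables `x` and `y` are hypothetical) and base R only. It assumes the studentized (Koenker) form of the test, in which the statistic is n times the R^2 of the auxiliary regression.

```r
set.seed(11)
n <- 400
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.5 + 0.3 * x)  # variance increases with x
fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the original predictors.
aux <- lm(residuals(fit)^2 ~ x)

bp_stat <- n * summary(aux)$r.squared  # studentized Breusch-Pagan statistic
bp_df   <- length(coef(aux)) - 1       # number of predictors in the auxiliary model
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(BP = bp_stat, df = bp_df, p = bp_p))
```

A small p-value indicates that the residual variance depends on the predictors, which is the situation found in the test output above.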

In multiple regression, two or more predictor variables may be correlated with each other; this situation is referred to as collinearity. Multicollinearity is the case where collinearity exists among three or more variables, even if no pair of variables has a particularly high correlation. It means that there is redundancy among the predictor variables. Multicollinearity can be assessed by computing the variance inflation factor (VIF), which measures how much the variance of a regression coefficient is inflated by multicollinearity in the model. The smallest possible value of VIF is 1 (no multicollinearity). A VIF value that exceeds 5 or 10 indicates a problematic amount of collinearity.

Code
library(car)  # provides vif()
vif(step.model)
  yearborn        age   electric      radio       educ   agefbrth   children 
459.569426 524.025963   1.250912   1.102156   1.523564   1.430945   2.370392 
   usemeth      agesq      urban   evermarr 
  1.210953  59.396420   1.183061   1.273500 
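For numeric predictors, the VIF of a variable equals 1 / (1 - R^2), where R^2 comes from regressing that variable on all the other predictors. The sketch below computes it by hand in base R on simulated data, in which a hypothetical `yearborn` is almost fully determined by `age`, to illustrate why such a pair produces very large values.

```r
set.seed(5)
n <- 300
age      <- sample(15:49, n, replace = TRUE)
yearborn <- 88 - age + sample(0:1, n, replace = TRUE)  # nearly a linear function of age
educ     <- pmax(0, round(10 - 0.1 * age + rnorm(n, sd = 3)))
X <- data.frame(age, yearborn, educ)

# Regress each predictor on the remaining ones and convert R^2 to a VIF.
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

print(round(vif_manual, 2))
```

In this simulated setting, age and yearborn both receive very large VIF values while educ stays close to 1, the same pattern as in the output above.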

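As a rule of thumb, a VIF value that exceeds 5 or 10 indicates a problem. Here yearborn (459.6), age (524.0) and agesq (59.4) are far above that threshold, which is expected because age is almost fully determined by year of birth and agesq is the square of age. Dropping the redundant predictors leads to a simpler model without compromising model accuracy. The new model is therefore fitted without yearborn and agesq.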

Without Multicollinearity

Code
library(MASS)
model2 <- lm(log(ceb) ~ mnthborn + age + electric + children + knowmeth +
               usemeth + urban + radio + tv + bicycle + I(as.factor(educ)) +
               idlnchld + urb_educ + protest,
             data = na.omit(Womendatacleaned))

step.model2 <- stepAIC(model2, direction = "both", trace = FALSE)

vif(step.model2) # Variance Inflation Factor (or VIF)
             GVIF Df GVIF^(1/(2*Df))
age      3.298838 34        1.017707
electric 1.269286  1        1.126626
children 3.353509 13        1.047639
urban    2.365418  1        1.537992
radio    1.065265  1        1.032117
idlnchld 1.215614  1        1.102549
urb_educ 2.779986  1        1.667329

All GVIF values are now below 5, so the reduced model shows no sign of problematic multicollinearity.

We expected that the number of children would decrease as the level of education increases, that is, a negative correlation between education and number of children. The results agree with this expectation: education is negatively associated with the number of children. The same negative association appears between birth control use and children ever born (ceb) or living children. In addition, age and education level are negatively correlated, while age and number of children are positively correlated.

Conclusion

In both developed and developing countries, better-educated women have fewer children than less-educated women. However, the reasons for this are less clear, since the benefits of education extend beyond the value of women’s time. Education can reduce fertility because better-educated women earn more and may raise their children more effectively. Education also improves maternal and child health, thereby increasing a woman’s physical capacity to give birth and reducing the (economic) necessity for more children. In addition, knowledge of modern contraception helps women control births. Finally, higher education empowers women and includes them in household decision-making on family planning. Each mechanism is significant, depending on the individual and institutional context, but there is limited evidence on the relative importance of each one.

References

[1] W. Sander (1992). The effect of women’s schooling on fertility.
[2] M. Ainsworth (1996). The Impact of Women’s Schooling on Fertility and Contraceptive Use.
[3] Naomi Rutenberg and Ian Diamond. Fertility in Botswana: The Recent Decline and Future Prospects.